-
- CHAPTER 4
-
- DEBUGGING STRATEGIES
-
-
- There are many individual components which contribute to a completed
- application. The logical flow of the program must be determined, the user
- interface must be designed, and appropriate algorithms must be selected.
- But no matter how much effort you devote to the design and implementation
- of a program, the bottom line is it must also work correctly.
- In an ideal scenario, you would begin writing a program by first
- jotting down some notes that describe its operation. Next, you would
- create an outline listing each of the program's major components. You
- would then determine all of the subroutines and functions that are needed,
- and perhaps even create a flow chart showing each of the paths that could
- be taken. Properly prepared for any situation that might arise, you
- finally write the actual code and find that it works perfectly. Now,
- what's wrong with this picture? Few people actually program that way!
- In practice, many programmers simply start coding with little
- forethought and no detailed plan. They begin with the first statement and
- continue to the last, occasionally reworking portions into subroutines as
- necessary. After all, planning is not nearly as much fun as programming,
- and everyone knows that fun is the most important part. Believe it or not,
- I agree. There's nothing really wrong with plodding through a program,
- stabbing here and there until it works. Indeed, some great algorithms
- developed out of aimless doodling. I have personally never drawn a flow
- chart, and I have no plans to start now.
- What I will address here is how to find and correct problems when they
- do occur. There are more things that can go wrong with a program than can
- go right, and tracking down an elusive "Illegal function call" error that
- appears only occasionally is definitely not much fun. How quickly you can
- solve these problems is directly related to your understanding of
- programming in general, and to your familiarity with the tools available.
- In this chapter you will learn how to identify problems in your
- programs, and also how to solve them. Programming errors, or bugs, can be
- as simple as a misspelled variable name, and as complex and ornery as an
- internal flaw in BASIC itself. The BASIC editing environment provides a
- wealth of powerful debugging features, and understanding how to use them
- will help you produce programs that are reliable and error free.
-
-
- COMMON PROGRAMMING ERRORS
- =========================
-
- There are three distinct types of programming errors: simple misspellings
- and other naming or syntax errors, incorrect logic such as misunderstanding
- or incorrectly coding an algorithm, and failing to understand some of the
- finer points of the BASIC language. No matter how carefully you type, no
- matter how much forethought you apply to a particular problem, and no
- matter how often you read the BASIC manuals, it is impossible to completely
- avoid making mistakes.
- The first category includes those errors caused by simple mistakes
- such as misspelling a variable or procedure name. Trying to call a
- subprogram that doesn't exist will be immediately obvious, because BASIC
- gives you an error message before the program can be run. But an incorrect
- variable name will return the wrong results with no warning.
- Passing the wrong number of arguments to a procedure may or may not be
- reported, depending on whether the routine has been declared. Assembly
- language routines in a Quick Library can be particularly pesky in this
- regard. Although BASIC automatically generates a DECLARE statement for
- BASIC subprograms and functions you have loaded in source form, it does not
- do this for routines in a Quick Library. If you call an assembly language
- routine incorrectly, you will probably crash the PC. However, it is also
- possible to corrupt string memory and not know it. Worse, a "String space
- corrupt" error is often not reported until much later in the program. If
- you run the short program below in the QuickBASIC 4.5 editor, it will
- appear to operate correctly.
-
-
- X$ = SPACE$(1000) 'create a string
- POKE SADD(X$) - 2, 100 'corrupt string memory
- PRINT "Testing"
- X% = 1
- PRINT "More testing"
- X% = 2
- PRINT "Yet more testing"
- X% = 3
-
-
- Here, the POKE statement is overwriting the back pointer that belongs to
- X$, which is one type of string corruption that can occur. But QuickBASIC
- doesn't know that this has happened, because it has no reason to check the
- integrity of its string memory until another string assignment is made.
- However, adding the statement PRINT FRE("") anywhere after the POKE command
- causes BASIC to check string memory, and report the error. Even if your
- program does not use POKE, calling a procedure incorrectly can cause it to
- overwrite memory in this fashion.
- Another simple error is inadvertently using the same variable name
- twice, or omitting a type declaration character from a variable name. For
- example, if you are using a variable named Bytes& to track how many bytes
- of a file have been read, accidentally using Bytes later on will give the
- wrong results. If a DEFINT statement is in effect, then Bytes will be an
- integer variable. Otherwise, it will be single precision, which is also
- incorrect. Unless you use the DIM...AS statement to declare a variable
- explicitly, BASIC lets you have different variables with the same name.
- That is, Var%, Var!, and Var# can all coexist in the same program, and each
- is a unique variable.
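-
- As a minimal sketch of this pitfall, the fragment below shows that Bytes&
- and Bytes are two separate variables. The values are purely illustrative,
- and the DEFINT statement is included only to show which type the untyped
- name takes on.
-

```basic
DEFINT A-Z              'untyped names now default to integer
Bytes& = 70000          'long integer running total
Bytes = Bytes + 512     'oops: this updates the integer Bytes, not Bytes&
PRINT Bytes&            'the long total is unchanged
PRINT Bytes             'the stray integer holds the 512
```

-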
- Similarly, using the wrong variable entirely will cause your program
- to operate incorrectly, and again with no error message displayed. More
- than once I have had a program with one FOR loop nested within another, and
- used the outer loop counter variable when I meant to use the inner one.
- Another common situation is caused by changing the name of a variable
- during the course of writing a program. For example, you may have a
- variable named BPtr that tracks where you are reading within a buffer. If
- you later decide to change that name to BufPointer because it is more
- meaningful, you must also remember to change all occurrences of the name.
- Of course, BASIC's search and replace feature minimizes that problem. More
- important, though, you must make a mental note to use the new name as you
- continue to develop the program.
- Forgetting to declare a function can also lead to incorrect results
- that produce no warning. If an integer function is not declared, then
- BASIC will dimension an array with that name if the function expects a
- numeric argument. When BASIC encounters the statement X = FuncName%(Y%) it
- assumes that FuncName% is an integer array, and creates an array containing
- the default 11 elements. In this case X will be assigned a value of zero,
- or you will receive a "Subscript out of range" error if Y% is not between 0
- and 10. I once observed an unexplainable "Out of string space" error that
- was caused by the statement Size = ScreenSize%(ULRow, ULCol, LRRow, LRCol).
- ScreenSize% was a function present in a Quick Library, but without a
- DECLARE statement BASIC created a 4-dimensional integer array.
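-
- The cure is a DECLARE statement ahead of the first use. The sketch below
- assumes integer parameters, and it supplies a hypothetical stand-in body
- for ScreenSize% so the fragment can run on its own; the real routine lived
- in a Quick Library and its actual parameter types are not given here.
-

```basic
DECLARE FUNCTION ScreenSize% (ULRow%, ULCol%, LRRow%, LRCol%)

Size% = ScreenSize%(1, 1, 25, 80)   'now a function call, not an array access
PRINT Size%

'Hypothetical stand-in body: two bytes (character and attribute) per cell.
FUNCTION ScreenSize% (ULRow%, ULCol%, LRRow%, LRCol%) STATIC
  ScreenSize% = (LRRow% - ULRow% + 1) * (LRCol% - ULCol% + 1) * 2
END FUNCTION
```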
-
-
- LOGIC ERRORS
- ============
-
- The second cause of bugs is logic errors, and these include adding when you
- meant to subtract, or using the wrong variable altogether. Programs that
- manipulate pointers (variables that hold the addresses of other variables)
- are particularly prone to errors in logic. Another common logic error is
- forgetting to trim the leading or trailing blanks from a file or directory
- name before using it. If the operator enters " c:\thisfile.dat" and you
- try to open that file, BASIC will report a "Bad file name" error.
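-
- One way to guard against this, sketched here with an illustrative prompt
- and variable names, is to trim the name with LTRIM$ and RTRIM$ before
- passing it to OPEN. Note that these functions remove blanks only, not tab
- characters.
-

```basic
LINE INPUT "File name: ", FileName$
FileName$ = LTRIM$(RTRIM$(FileName$))   'strip leading and trailing blanks
IF LEN(FileName$) THEN                  'skip the OPEN if nothing was entered
  FileNum = FREEFILE
  OPEN FileName$ FOR INPUT AS #FileNum
  '...read the file here...
  CLOSE #FileNum
END IF
```

-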
- Another cause of logic errors is failing to consider all of the things
- a user may enter. An inexperienced operator is likely to enter data that
- you as the programmer would never consider, or select menu items in an
- order that makes no sense. Indeed, never underestimate the value of beta
- testers. After you have exhausted all of the possibilities you can think
- of, give the program to a 4-year-old child, and ask him or her to try it
- while you watch. Your uncle Ernie would be a good beta tester too, and the
- less he knows about your program, the more valuable his contribution will
- be. People who know absolutely nothing about computers have an uncanny
- knack for creating "Illegal function call" errors in a program that you
- just know is perfect.
- Similarly, you must consider all of the possible error conditions that
- could happen in a program. In an error handler that has a CASE statement
- for each possibility you anticipate, also include a CASE ELSE clause for
- those you haven't thought of. The short listing that follows shows a
- typical error handler that incorporates this added safety measure.
-
-
- ON ERROR GOTO HandleErr
- ...
- ...
- HandleErr:
- SELECT CASE ERR
- CASE 7, 14
- PRINT "Out of memory"
- CASE 24, 25, 27
- PRINT "Fix the printer"
- CASE 53
- PRINT "File not found"
- CASE ELSE
- PRINT "Error number"; ERR
- END SELECT
- ...
- ...
-
-
- The CASE ELSE clause lets you accommodate any possibility, and your user
- can then at least report to you what the error number was. This simple
- example doesn't include all of the possibilities, but you can certainly see
- the general concept.
- Another common logic error is using the same file number twice. When
- a file has been opened as #1, that number remains in use until the file is
- closed. This can be problematical when writing reusable modules, since
- there is no way to know which files may be in use by the main program.
- Some programmers use #99 or another unlikely number in a routine that will
- be reused in many programs. But even that approach is flawed, because you
- have to remember which numbers are used by which routines.
- BASIC's FREEFILE function is intended to solve this problem, and it
- returns the next available file number. Be sure to save the results
- FREEFILE returns, however, since the value will change as soon as the next
- file is opened. The code below shows both the wrong and right ways to use
- FREEFILE.
-
-
- Wrong:
-
- OPEN "accounts.dat" FOR INPUT AS #FREEFILE
- INPUT #FREEFILE, X$ 'FREEFILE has changed!
- CLOSE #FREEFILE
-
-
- Right:
-
- FileNum = FREEFILE 'get and save the number
- OPEN "accounts.dat" FOR INPUT AS #FileNum
- INPUT #FileNum, X$
- CLOSE #FileNum
-
-
- In the first example, if FREEFILE returns, say, a value of 2, then it will
- return 3 at the INPUT statement, which is of course incorrect. Therefore,
- you must save the value FREEFILE returns, and use that for all subsequent
- file accesses. This situation also occurs with INKEY$, because once a
- character has been returned it is no longer available unless you saved it.
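-
- The INKEY$ case can be sketched the same way. The variable name K$ is
- simply an illustrative choice.
-

```basic
'Wrong: each use of INKEY$ is a separate read of the keyboard buffer,
'so the key that ended the loop is already gone when the IF is reached.
'  DO: LOOP UNTIL LEN(INKEY$)
'  IF INKEY$ = CHR$(27) THEN PRINT "Escape was pressed"

'Right: read once, save the result, then test the saved copy.
DO
  K$ = INKEY$
LOOP UNTIL LEN(K$)
IF K$ = CHR$(27) THEN PRINT "Escape was pressed"
```

-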
- Two other frequent problems are attempting to use LSET to assign
- characters into a string that does not exist, and failing to clear a
- counter variable within a static subprogram or function. The second
- problem can be especially frustrating, because the routine will work
- correctly the first time it is invoked. In the function below, a counter
- returns the number of embedded control characters it finds in a string.
-
-
- FUNCTION CtrlCount%(Work$) STATIC
-
- FOR X% = 1 TO LEN(Work$)
- IF ASC(MID$(Work$, X%, 1)) < 32 THEN
- Count% = Count% + 1
- END IF
- NEXT
-
- CtrlCount% = Count% 'return the count
-
- END FUNCTION
-
-
- The problem here is that Count% retains its value between function
- invocations. Therefore, each time CtrlCount% is used it will return ever
- higher values. One solution is to add the statement Count% = 0 at the
- beginning of the function. Another is to omit the STATIC option from the
- function definition.
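-
- A sketch of the first fix follows, with a short test added for
- illustration. The test string holds three control characters, so both
- calls should report the same count.
-

```basic
DECLARE FUNCTION CtrlCount% (Work$)

Test$ = "A" + CHR$(9) + "B" + CHR$(13) + CHR$(10)
PRINT CtrlCount%(Test$)        'three control characters
PRINT CtrlCount%(Test$)        'the same answer on the second call

FUNCTION CtrlCount% (Work$) STATIC

  Count% = 0                   'clear the value left over from the last call
  FOR X% = 1 TO LEN(Work$)
    IF ASC(MID$(Work$, X%, 1)) < 32 THEN
      Count% = Count% + 1
    END IF
  NEXT

  CtrlCount% = Count%          'return the count

END FUNCTION
```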
-
-
- UNDERSTANDING BASIC'S QUIRKS
-
- The third type of error is caused by not understanding some of BASIC's
- finer points and quirks. For example, some people do not realize that
- omitting the third argument from MID$ causes it to return all of the
- remaining characters in a string. To see if a drive letter was given as
- part of a file name and if so extract it, you might use a statement such as
- IF MID$(FileName$, 2) = ":" THEN Drive$ = LEFT$(FileName$, 1). But since
- the number of characters is not specified to MID$, it returns all but the
- first character in the string. Unless the string is a drive letter and
- colon only ("C:"), the test for a colon can never succeed. The solution, of
- course, is to use MID$(FileName$, 2, 1).
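-
- Written out as a short fragment, with a sample file name supplied only for
- illustration, the corrected test looks like this:
-

```basic
FileName$ = "C:\DATA\ACCOUNTS.DAT"      'sample value for illustration
Drive$ = ""
IF MID$(FileName$, 2, 1) = ":" THEN     'examine only the second character
  Drive$ = LEFT$(FileName$, 1)
END IF
PRINT "Drive: "; Drive$
```

-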
- Another instance in which an intimate knowledge of BASIC's
- idiosyncrasies comes into play involves the earlier example of a file
- name that contains leading blanks. Most programmers do not use INPUT to
- accept information, unless the program is very simple and it will be used
- only occasionally. However, asking for a file name with INPUT is one way
- to avoid that problem, because INPUT strips all leading and trailing blank
- spaces, as well as CHR$(9) tab characters. The more useful LINE INPUT, on
- the other hand, does not strip leading blanks and tabs. Most programmers
- would never be so foolish as to enter a file name with leading blanks. So
- this is yet another situation where it is important to consider all of the
- possibilities.
- It is also possible to crash a program by using the ASC function when
- the string might be null. Again, *you* would never press Enter alone in
- response to a prompt for a file name or other mandatory information, but
- someone else might.
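-
- A simple guard is to test the length before calling ASC, since ASC of a
- null string produces an "Illegal function call" error. In this sketch the
- prompt and the choice of zero as the fallback code are assumptions made for
- the example.
-

```basic
LINE INPUT "Continue (Y/N)? ", Answer$
IF LEN(Answer$) THEN           'ASC is safe only when the string is not null
  Code% = ASC(Answer$)
ELSE
  Code% = 0                    'assumed fallback: treat Enter alone as code 0
END IF
PRINT "Key code:"; Code%
```

-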
- Another BASIC quirk is caused by rounding errors. As you saw in
- Chapter 2, adding or multiplying many numbers in succession can produce
- results that are not precisely correct. Instead of checking to see if a
- value is zero, it is often better to compare it to a very small number.
- That is, instead of IF Value# = 0 you would use IF Value# < .000001 if the
- value can never be negative, or IF Value# < .000001 AND Value# > -.000001
- otherwise. Also, some numbers simply cannot be represented exactly in
- binary floating point. If you try to enter the
- statement X# = .00000000001 in the QuickBASIC 4.5 editor, the value will be
- converted to 9.999999999999999D-12 as soon as you press Enter.
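-
- The comparison against a small number can be written compactly with ABS.
- In the sketch below, the constant name Epsilon# and its value are
- assumptions; pick a tolerance that suits the magnitude of your data. The
- first test may well fail even though the result is mathematically zero,
- while the second allows for the accumulated rounding error.
-

```basic
CONST Epsilon# = .000001#      'assumed tolerance

Value# = 0
FOR X% = 1 TO 10
  Value# = Value# + .1#        'ten additions of one tenth
NEXT
Value# = Value# - 1#           'mathematically zero

IF Value# = 0 THEN PRINT "Exactly zero"
IF ABS(Value#) < Epsilon# THEN PRINT "Close enough to zero"
```

-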
- Although not technically a BASIC quirk, many programmers forget that
- variables within a DEF FN function are by default global. Unless you
- include an explicit STATIC statement listing each variable that is to be
- local to the function, it is likely that an unexpected change will be made
- to a variable in the main program.
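-
- The short sketch below shows the STATIC statement at work. The function
- name FNTotal% and its variables are invented for the example. If the
- STATIC line is removed, the loop inside the function changes the main
- program's X% as well.
-

```basic
DEF FNTotal% (N%)
  STATIC X%, Sum%        'X% and Sum% are now local to the function
  Sum% = 0
  FOR X% = 1 TO N%
    Sum% = Sum% + X%
  NEXT
  FNTotal% = Sum%
END DEF

X% = 99                  'main program variable with the same name
PRINT FNTotal%(10)       'sum of 1 through 10
PRINT X%                 'still 99 because of the STATIC statement
```

-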
- Some programming situations require that you obtain the address of a
- string variable using SADD. However, SADD is not legal for use with a
- fixed-length string or the string portion of a TYPE variable. More
- important, when using BASIC PDS far strings you must also remember to use
- SSEG to get the string's data segment. Using VARSEG will not create an
- error; however, the program will not work correctly.
- Related to that, it is important to remember that strings and dynamic
- arrays move around in memory--often at unexpected times. The program below
- appends a zero character to one string for each zero that is found in
- another string. Since BASIC may move Work$ during the course of assigning
- Zero$, this code will fail eventually:
-
-
- Address = SADD(Work$)
- FOR Y = Address TO Address + LEN(Work$) - 1
- IF PEEK(Y) = 48 THEN Zero$ = Zero$ + "0"
- NEXT
-
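- One way to avoid the problem, sketched here on the assumption of
- QuickBASIC near strings, is to make no string assignments at all while the
- address is in use. This version merely counts the zeros inside the loop,
- and then builds the result with a single assignment afterward. The sample
- contents of Work$ are for illustration only.
-

```basic
Work$ = "1001 and 2002"                   'sample data containing four zeros
Count% = 0
Address = SADD(Work$)                     'taken just before it is needed
FOR Y = Address TO Address + LEN(Work$) - 1
  IF PEEK(Y) = 48 THEN Count% = Count% + 1   'no string is assigned here
NEXT
Zero$ = Zero$ + STRING$(Count%, "0")      'a single assignment afterward
PRINT Zero$
```

-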
-
- Another particularly insidious bug can result if you inadvertently add
- parentheses around a variable that is passed to a subprogram or function.
- In the example below, a subprogram that intentionally modifies a parameter
- has been declared and is then called without the CALL keyword.
-
-
- DECLARE SUB Square(Param%)
- Square (Value%)
-
- SUB Square(Value%) STATIC
- Value% = Value% * Value%
- END SUB
-
-
- Because of the unnecessary and incorrect use of parentheses, a copy of the
- argument is sent to Square instead of the argument itself, with the result
- that Value% is never actually changed. The fix is to either remove the
- parentheses, or add the word CALL. Another, related issue is placing a
- DEFINT after DECLARE statements. In the example below, the parameters X,
- Y, and Z are assumed by BASIC to be single precision, even though this is
- clearly not what was intended.
-
-
- DECLARE SUB MySub (X, Y, Z) 'X, Y, and Z are singles!
- DEFINT A-Z
- .
- .
-
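- Placing the DEFINT first avoids the mismatch. The subprogram name ShowSum
- in this sketch is hypothetical.
-

```basic
DEFINT A-Z                     'set the default type first...
DECLARE SUB ShowSum (X, Y, Z)  '...so X, Y, and Z are integers here

ShowSum 1, 2, 3

SUB ShowSum (X, Y, Z) STATIC
  PRINT X + Y + Z
END SUB
```

-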
-
- The final issue I want to address here is potential overflow errors. The
- statement IF IntVar% * 14 > 1000000 can never be true, because BASIC
- performs integer math assuming an integer range only. Unless you compile
- your program using the /d debug option, the error will be unreported in a
- compiled program. If this statement is executed within the QB environment,
- BASIC will report an overflow error, even though the instruction certainly
- appears to be legal. But since integer math assumes an integer result, the
- product of IntVar% times 14 will overflow the range of integer values if
- IntVar% is greater than 2,340.
- One solution is to use a long integer for IntVar, and BASIC will then
- use the range of long integers for the comparison. Using a long integer
- wastes memory, however, and calculations on long integers are slower and
- require more code to implement. A much better solution is to use CLNG
- (Convert to Long), which tells BASIC to assume a long integer result.
- The statement IF CLNG(IntVar%) * 14 > 1000000 will create a long
- integer version of IntVar%, and then multiply the result by 14 and use
- that for the subsequent comparison. Unlike the copies that BASIC makes
- which steal DGROUP memory, the long integer conversion in this instance is
- handled within the CPU's registers. CLNG when used this way is really just
- a compiler directive, as opposed to a called library routine. Another
- solution is to add an ampersand after the constant 14, thus: IF IntVar% *
- 14& > 1000000. Again, no additional DGROUP memory is used to handle 14 as
- a long integer value.
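-
- Both approaches are shown in the fragment below. The starting value and
- the Limit& threshold are illustrative choices, picked so that the product
- (42000) exceeds the integer range but is easy to check by hand.
-

```basic
IntVar% = 3000
Limit& = 40000                 'illustrative threshold

'IF IntVar% * 14 > Limit& ...  'integer multiply: 42000 would overflow

IF CLNG(IntVar%) * 14 > Limit& THEN PRINT "Over the limit (CLNG)"
IF IntVar% * 14& > Limit& THEN PRINT "Over the limit (14&)"
```

-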
- Another interesting use of CLNG and CINT--unrelated to debugging but
- worth mentioning nonetheless--is to reduce the size of comparison code.
- When you use a statement such as IF X% > VAL(Some$), a floating point
- comparison is performed even if Some$ holds an integer value. By replacing
- that example with IF X% > CINT(VAL(Some$)), 6 bytes of code can be saved.
- The CINT tells BASIC that it will not have to perform any floating point
- rounding when it compares the two values.
-
-
- DEBUGGING AND TESTING TECHNIQUES
- ================================
-
- When you are developing a large application that is comprised of many
- individual modules, there are several useful debugging techniques you can
- employ. One is to create short test-bed programs that exercise each
- subprogram and function. Finding an error in a complex program with many
- interdependencies between subroutines can be a tedious prospect at best.
- If you instead create a small program whose sole purpose is to test a
- particular subprogram, you will be better able to focus on just that
- routine.
- Another useful technique for detecting and preventing sporadic errors
- is to test your code on "boundary conditions". If you have a routine that
- reads and processes a file in 4K (4096 byte) increments, test it with a file
- that is exactly 4096 bytes long, as well as with other test files that are
- 4095 and 4097 bytes long.
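-
- Test files of those exact sizes are easy to generate. The sketch below
- assumes the hypothetical names TEST4095.DAT through TEST4097.DAT, and it
- assumes those files do not already exist, because opening an existing file
- for BINARY does not shorten it.
-

```basic
FOR Size% = 4095 TO 4097
  FileName$ = "TEST" + LTRIM$(STR$(Size%)) + ".DAT"
  Buffer$ = STRING$(Size%, "X")       'filler data of the required length
  FileNum% = FREEFILE
  OPEN FileName$ FOR BINARY AS #FileNum%
  PUT #FileNum%, , Buffer$            'writes the bytes of Buffer$ only
  PRINT FileName$; LOF(FileNum%)      'report the resulting length
  CLOSE #FileNum%
NEXT
```

-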
- Perhaps nothing is more frustrating than having a program fail with
- the message "xxx at line No line number". This message is a throw-back to
- the days when all BASIC programs had to use line numbers. Now that line
- numbers are not required in modern compiled BASIC, most programmers do not
- use them, opting instead for more descriptive line labels when labels are
- needed at all. When an error does occur and the program has been compiled
- with /d, BASIC reports the number of the nearest numbered line preceding
- the line in which the error occurred.
- A good solution to track down the cause of such errors is to use a
- variant on a hardware debugging technique known as the "cut in half"
- method. In a complex electronic circuit that does not work, using this
- technique means that the circuit is first checked at its mid-point for the
- correct signal. If the circuit tests correctly at that point, then the
- error is in the second half. Therefore, the test engineer would "cut in
- half" again, and test at a point halfway between the middle and the end.
- If the test fails there, then the problem must lie between the middle of
- the circuit and that point.
- In a purely software situation, you would add a line number to a line
- that falls approximately half-way through the program. If that number is
- reported, then the problem is occurring in the second half of the program.
- An enhancement to this technique that I recommend is to add, say, ten line
- numbers in evenly spaced increments throughout the program. This will let
- you quickly isolate the problem to a much smaller portion of the program.
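-
- As a rough illustration, the hypothetical fragment below has line numbers
- added at intervals. Numbered and unnumbered lines may be mixed freely. With
- the deliberate error placed on line 40, that is the number you would expect
- to see reported when the program is compiled with /d and run.
-

```basic
10  PRINT "Loading"            'hypothetical program sections
    Total% = 0
20  FOR X% = 1 TO 10
      Total% = Total% + X%
    NEXT
30  PRINT "Total:"; Total%
40  LOCATE 100, 1              'deliberate error for demonstration
50  PRINT "Done"
```

-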
- Besides the line number (or lack of line number) that BASIC reports,
- the segment and address at which the error occurred are also reported. This
- information is frankly useless in a purely BASIC environment. You must
- either use CodeView to identify the line that is associated with the error,
- or view the assembly language output that BC can optionally generate.
- These will be described in the section on advanced debugging later in this
- chapter.
- Finally, it is important to point out that you should never use ON
- ERROR while a program is being developed. ON ERROR can hide programming
- errors that you need to know about. As an example, a LOCATE statement with
- incorrect values will generate an "Illegal function call" error. But if ON
- ERROR is in effect and your program uses RESUME NEXT for errors it is not
- expecting, you may never even know that an error occurred. If you run the
- complete program below you can see that there is no indication that an
- error occurred at the obviously illegal LOCATE statement.
-
-
- CLS
- ON ERROR GOTO HandleErr
- LOCATE 100, -90
- PRINT "My program seems to work fine."
- END
-
- HandleErr:
- RESUME NEXT
-
-
- USING THE QB AND QBX EDITING ENVIRONMENTS
-
- The single most powerful debugging feature that is available to you is the
- BASIC editing environment. More than just an editor that you can use to
- enter program statements, the QB environment is exactly that: a complete
- editing environment for developing and testing BASIC programs. The BASIC
- editor lets you enter program statements, single-step through a program,
- examine variable values, and much more. Besides being able to execute
- commands singly and in sequence, you can also trace into subroutines and
- functions, and even run your program in reverse.
- The primary advantage of using the QB environment instead of a
- separate editor is the enhanced debugging capabilities. In most high-level
- languages, you first write a program using an editor, and then compile and
- run it to see if it works correctly. If an error occurs, you must start
- the editor again, load your program, and study the code to see what went
- wrong. In contrast, QB lets you run your program at the same time it is
- being edited. You can even modify the program while it is running and then
- resume execution, view and change variable values, and change the order in
- which statements are executed.
- Further, BASIC can be instructed to stop and return to the edit mode
- when the program reaches a certain statement, or when a particular logical
- condition becomes true. For example, you can tell BASIC to halt the
- program when a variable takes on a specified value. These are extremely
- powerful debugging tools which have no equal in any other language. In the
- sections that follow, I will describe each of these capabilities in detail.
-
-
- STEP AND TRACE DEBUGGING
-
- Early versions of Microsoft BASIC offered a very primitive trace capability
- that displayed the line numbers of the currently executing statements.
- Although this was better than nothing, interpreting a blur of line numbers
- flashing by on the screen required a lot of mental effort. When Microsoft
- introduced QuickBASIC version 3.0 they added greatly improved debugging in
- the form of a step and trace feature. To activate step and trace you would
- enter a STOP statement at a selected point in the source code. When the
- program reached that point you could then execute each statement in
- sequence by pressing a function key. QuickBASIC 3 also provided the
- ability to display continuously the value of a single variable in a window
- at the top of the screen.
- QuickBASIC 4.0 offered an improved version of this feature, using
- additional function keys to control how a program proceeds. This method
- has been continued with little change through current versions of
- QuickBASIC and BASIC PDS. Of course, the primary reason you would want to
- step through a program one statement at a time is to determine why it is
- not working. For example, if you have code that opens a file for output
- but the file is never created, you would step through that portion of the
- code to see which statements are being executed and which are not. In
- particular, stepping through a program lets you see which path an IF or
- CASE test is taking.
- Two function keys are used to single-step through a program, and four
- additional options are available to assist program debugging. Each time
- the F10 key is pressed, the current statement is executed and the program
- advances to the next statement. If you have just loaded the program being
- tested, you will press F10 once to get to the first instruction. Pressing
- F10 again executes that statement, and continues to the next one. If the
- current statement is related to screen activity, the screen is switched
- momentarily to display the program's output rather than the source code.
- The screen is also switched during a CALL statement or function invocation,
- in case that routine performs screen output. You can optionally toggle
- between viewing the output and edit screens manually by pressing F4.
- In some cases you may want to treat a subroutine as a single
- statement, which is what F10 does. That is, CALL MySub is handled as a
- single statement, and all of the statements within the routine are executed
- as one operation. In other cases, however, you may need to trace into a
- subprogram, GOSUB routine, DEF FN, or function, to step through its
- statements as well. This is what F8 is for. When F8 is pressed at a CALL
- or GOSUB statement or function invocation, BASIC traces into the procedure
- and lets you watch as it executes each statement individually.
- Two additional capabilities let you navigate a program more quickly.
- Pressing F7 tells BASIC to execute all of the statements up to the current
- cursor location. This way, you are spared from having to watch a long
- sequence of commands that you know are working correctly. For example,
- stepping through a FOR/NEXT loop that initializes 1000 elements in an array
- is usually pointless. Therefore, when you reach that spot in the program
- you would manually move the cursor to the statement following the NEXT, and
- press F7.
- It is also possible to force execution to a particular point in the
- program using the "Set next statement" option of the Debug menu. Unlike
- F7, though, the statements that precede the selected line will not be
- executed. Therefore, this option is equivalent to adding a temporary GOTO
- to the program, causing it to jump to the specified line.
- One of the most powerful features of the BASIC editor is that you can
- actually modify your program, then resume execution. In earlier versions
- of QuickBASIC, making even the slightest change to a program--even if only
- to a single comment--meant the entire program had to be recompiled. BASIC
- can now preserve variable values and indeed the entire program state during
- most types of editing operations.
- The last important step operation I want to mention now is the History
- feature. This too must be selected from a menu, and using it will slow
- your program's operation considerably. When the History option is selected
- from the Debug menu, BASIC remembers the last 20 program statements, and
- lets you step through your program in reverse. For example, if a variable
- has taken on an incorrect value, you can walk backwards through the program
- to see what statements caused that to happen. Where F8 steps forward
- through your program, Shift-F8 instead steps backward.
-
-
- WATCH VARIABLES AND BREAK POINTS
-
- As powerful as BASIC's single-step feature is, it is only half of the
- story. Equally important is the Watch capability that lets you view a
- program's variables in real time. One or more variables may be placed into
- a special Watch window at the top of the editing screen, and their values
- will be displayed and updated after each statement is executed. Between
- the Step and Watch features, you can observe all aspects of your program's
- operation as it is executing.
- Besides watching variable values, you can also monitor complex
- expressions and function results. For example, you could watch the value
- of X% * Y% + Z%, ASC(Work$), or the result of a function such as
- StrFunction$(Array$(), Count%). Because each variable or expression is
- updated after every program statement, your program will run more slowly
- when many items are displayed in the watch window. However, this is seldom
- a problem in a debugging situation, and the ability to see precisely what
- is happening far outweighs the minor speed penalty.
- Being able to watch the results of expressions as well as simple
- variables offers some useful and interesting techniques. As an example,
- suppose you are watching a string variable named Buffer$. If Buffer$ is
- very long, you can use LEFT$ or MID$ to watch just a portion of the string:
- MID$(Buffer$, CurPointer%, 70). This expression displays the 70-character
- portion of Buffer$ that is currently pointed to by CurPointer% (assuming,
- of course, you are using variables with those names).
- Likewise, if you are observing a string but nothing is showing in the
- watch window, you could watch "{" + Work$ + "}". This displays "{}" if the
- string is null, and shows if there are leading or trailing blanks or
- CHR$(0) bytes. Adding braces also lets you see if the string contains
- characters that begin past the edge of the visible window.
- One particularly powerful use of BASIC's Watch capability is related
- to the fact that all of the expressions are evaluated anew at each
- statement. Earlier I mentioned how insidious "String space corrupt" errors
- can be, because BASIC checks the integrity of its string memory only when a
- string is being assigned. Therefore, watching the expression FRE(Any$)
- tells BASIC to evaluate string memory after every source line. Thus, as
- soon as string memory is corrupted it will be immediately reported. This
- technique can be extended to identify a "Far heap corrupt" error as well,
- by watching the expression FRE(-1).
- Besides the Step and Watch capabilities, there are two additional
- features you should understand: Break Points and Watch Points. When a
- program is very large and complex, it becomes impractical to step and trace
- through every statement. Also, in some cases you may not know at which
- statement an error is occurring.
- Pressing F9 sets up a Break Point which tells BASIC to halt when it
- reaches that point in the program, regardless of how it arrived there. You
- can have multiple break points, and the program will run normally until the
- specified statement is about to be executed. Simply place the cursor on
- the line at which the program is to stop, and press F9. That line will be
- highlighted to show that it is currently a Break Point. Pressing F9 again
- removes the Break Point.
- A Watch Point tells BASIC to execute the program until a certain
- condition becomes true. Some examples of Watch Points are X% = 100,
- ABS(Total#) > 1000, and FRE("") < 1000. In the first example you are
- telling BASIC to stop the program and return to the editor when X% equals
- 100. The second example will stop the program when the absolute value of
- Total# exceeds 1000, and the third halts it when there are fewer than 1000
- bytes of string space remaining.
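- A similar effect can be sketched in code by testing the condition
- yourself and executing STOP, which halts the program at that statement.
- The loop and values below are hypothetical, and unlike a true Watch Point
- the test is made only where you place it, not after every statement.
-
- Total# = 0
- FOR X% = 1 TO 200
-   Total# = Total# + X% * 7.5#
-   IF ABS(Total#) > 1000 THEN STOP    'stands in for the Watch Point
- NEXT
-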
- Considered together, these debugging features are extremely powerful.
- You can tell BASIC, in effect, "Run until the value of Count% hits 14; then
- stop the program, and let me walk backwards through the program to see how
- that happened."
-
-
- USING /D TO DETECT ERRORS
-
- Another very powerful debugging solution at your disposal is to compile
- your program with the /d debug option. When creating an .EXE file in the
- BASIC environment from the Run menu, you would select the "Produce debug
- code" option. Compiling with /d tells BC to add three important safeguards
- to the code it generates. Some of these debugging issues were described in
- Chapter 1, but they deserve elaboration here.
- The first code addition is a call to a central event handler prior to
- every BASIC program statement, to detect if Ctrl-Break was pressed.
- Normally, a compiled BASIC program is immune to Ctrl-Break and
- Ctrl-C, unless the program is processing an INPUT statement. BASIC adds
- break checking to let you get out of an endless loop or other similar
- situation, without having to reboot your computer.
- The second addition is an overflow test following each integer and
- long integer addition, subtraction, and multiplication, to detect results
- that exceed the range of legal values. If you have a statement such as X%
- = Y% * Z% and the result after multiplying is greater than 32767 or less
- than -32768, the overflow test will detect that and produce an error
- message. Otherwise, X%
- would be assigned an erroneous value and your program would have no way to
- detect it. Floating point operations do not need any additional testing,
- because overflows are detected and reported whether or not /d is used.
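- A minimal sketch of a statement that trips the overflow test is shown
- here. The values are arbitrary, and variables are used rather than
- constants so the multiplication is performed when the program runs.
-
- Y% = 300
- Z% = 200
- X% = Y% * Z%        '60000 does not fit in an integer
- PRINT X%
-
- Run in the editor or compiled with /d, this should stop at the third line
- with an "Overflow" error. Compiled without /d, it carries on and prints
- whatever erroneous value was left in X%.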
- The last additional code that BASIC adds when /d is used is array
- element bounds checking. If you have dimensioned an array and attempt to
- assign an element that doesn't exist, a compiled BASIC program will
- normally ignore the error. For example, if an array has been dimensioned
- using DIM Array%(1 TO 100) and you then have the statement Array%(200) =
- 12, BASIC will store the value 12 at what would have been the 200th
- element. This can lead to disastrous consequences such as overwriting an
- element in another array, or corrupting string memory. When /d is used
- BASIC adds additional code to check every array element referenced, and
- reports an error if that element does not exist.
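- The situation just described can be reproduced with the short sketch
- below. The second array and the Index% variable are additions for
- illustration; whether Other% is what actually gets overwritten depends on
- where BASIC happens to place each array in memory.
-
- DIM Array%(1 TO 100)
- DIM Other%(1 TO 100)
- Index% = 200
- Array%(Index%) = 12       'no such element exists
- PRINT Other%(1)
-
- With /d, or when run in the editor, the assignment should be reported as
- a "Subscript out of range" error. Without /d the value 12 is written to
- memory that does not belong to Array%.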
- Because of the added checking for overflow errors and illegal element
- numbers, a program compiled with /d will be larger and run more slowly than
- one in which /d is not used. Therefore, you should not release a program
- for general use that has been compiled with the debug option. One
- exception worth noting is that QuickBASIC versions 4.0 and 4.5 contain a
- bug that generates incorrect code for certain long integer array
- operations. The only solution when that happens is to use /d. This way,
- the routine that calculates element addresses and checks for illegal
- element numbers is used, rather than the incorrect in-line code that BC
- produces directly.
- You could also compile with the /ah (huge array) switch, which uses
- the same routine to calculate and check array element addresses. Using /ah
- has an advantage over /d in this case, because your program will not be
- halted if Ctrl-Break is pressed. Using /ah also avoids the extra code and
- time to check for overflow errors. However, /ah affects dynamic arrays
- only, and errors with static arrays will not be prevented.
- When a program is run in the BASIC editor, the same protection that /d
- provides is employed. This added debug testing within the editor is one
- more contributor to its slowness when compared to a fully compiled program.
-
-
- ADVANCED DEBUGGING
-
- Although being able to step through your program and watch its variables in
- the BASIC editing environment is very powerful, there are still some
- limitations inherent in that process. For example, it is possible that a
- program will work perfectly in the editor, but not when it has been
- compiled to an .EXE program. Microsoft has tried to make the BASIC editor
- as compatible with BC as possible, but the editor is an interpreter and not
- a true compiler. There are bound to be some differences in how the program
- runs. Another limitation is that some programs are just too large to be
- run within the editor. Finally, if you receive an error message from an
- executable program that lists only a segment and address, there is no way
- to determine where the error occurred using the editor.
- In these cases you will need to work with the actual compiled program.
- To relate an error address to the original BASIC source statement you must
- be able to see the assembly language code that BC generates, along with the
- original BASIC source. One way to do this is with the Microsoft CodeView
- debugger. CodeView comes with BASIC PDS [and VB/DOS Professional Edition]
- as well as with Microsoft's Macro Assembler. CodeView provides a debugging
- environment that is similar to the QB editor, except it is intended for
- tracing through a program that has already been compiled.
- Another way is to instruct BC to generate an assembly language source
- listing as it compiles your program. This listing shows a mix of BASIC
- source statements and the resultant assembly language code and addresses.
- However, the listing is not as clear or easy to follow as the display that
- CodeView presents. But if you do not have CodeView, this is your only
- choice. I will describe this method first.
-
-
- CREATING AN ASSEMBLY LANGUAGE SOURCE LISTING
-
- To create an assembly language list file you use the compiler's /a switch,
- and then specify a list file name. The syntax is shown below, followed by
- a sample list file that is generated.
- You enter this:
-
- bc program /a [/other options] , , listfile;
-
-
- LISTFILE.LST contains this:
- PAGE 1
- 25 June 91
- 14:28:08
- Microsoft (R) QuickBASIC Compiler Version 4.50
-
- Offset Data Source Line
-
- 0030 0006 CLS
- 0030 0006 INPUT Count%
- 0030 ** I00002: mov ax,0FFFFh
- 0033 ** push ax
- 0034 ** call B$SCLS
- 0039 ** mov ax,offset <const>
- 003C ** push ax
- 003D ** call 0000h
- 0040 ** pop ax
- 0041 ** add ax,000Dh
- 0044 ** push cs
- 0045 ** push ax
- 0046 ** call B$INPP
- 004B ** jmp $+04h
- 004D ** dw 0002h
- 004F ** db 00h
- 0050 ** db 02h
- 0051 ** mov bx,offset COUNT%
- 0054 ** push ds
- 0055 ** pop es
- 0056 ** push es
- 0057 ** push bx
- 0058 ** call B$RDI2
- 005D 0008 IF Count% < 100 THEN
- 005D 0008 Count% = 100
- 005D 0008 END IF
- 005D ** call B$PEOS
- 0062 ** cmp word ptr COUNT%,64h
- 0067 ** jl $+03h
- 0069 ** jmp I00003
- 006C ** mov COUNT%,0064h
- 0072 0008 PRINT Count%
- 0072 0008 END
- 0072 0008
- 0072 0008
- 0072 ** I00003: push COUNT%
- 0076 ** call B$PEI2
- 007B ** call B$CEND
- 0080 ** call B$CENP
- 0085 0008
-
- 43981 Bytes Available
- 43643 Bytes Free
-
- 0 Warning Error(s)
- 0 Severe Error(s)
- Here, the list file shows the original BASIC source code, as well as the
- generated assembly language instructions. The column at the left holds the
- code addresses, and these correspond to the addresses that BASIC displays
- when a program crashes with an error message. Unfortunately, several BASIC
- statements are grouped together, so it is not immediately apparent which
- address goes with which source statement. For example, after the BASIC
- statement INPUT Count%, the earlier assembly language instructions that
- clear the screen are shown. Similarly, the call to B$PEOS is actually part
- of the INPUT code, although it is listed following the IF test.
- When BASIC displays an error message and ends your program by
- displaying a segmented address, only the address portion is meaningful.
- The segment in which a program is running will depend on many factors,
- including the DOS version (and thus its size), the FILES= and BUFFERS=
- values specified in CONFIG.SYS, and whether TSR programs and device drivers
- are loaded. Each of these factors can cause the program to be loaded at a
- higher segment, although the addresses within that segment never change.
- Also, in a multi-module program, a different segment is used for each
- module's source file. Therefore, if the message is "Illegal function call
- in module XYZ at address 3456:1234", you would compile XYZ.BAS to create a
- list file instead of the main program. The code in the vicinity of address
- 1234 will be where the error occurred.
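- For the module XYZ.BAS in that example, the listing could be produced as
- sketched here. The command follows the syntax given earlier; the bracketed
- part is a placeholder, on the assumption that you should repeat whatever
- switches were used to build the failing program, since different switches
- generate different code and therefore different addresses.
-
- bc xyz /a [/same options as the original build] , , xyz;
-
- You would then look in XYZ.LST for the offset nearest 1234 in the left
- column, and work back to the BASIC statements listed above that block of
- instructions, keeping in mind the grouping described earlier.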
-
-
- USING MICROSOFT CODEVIEW
-
- Although compiling with the /a switch lets you view the assembly language
- code that BASIC creates, there is little you can actually do with that
- information. CodeView is a much more powerful debugging tool, and it lets
- you step through an .EXE file as it is running. This lets you follow the
- compiled program's execution path, and also view its assembly language
- instructions. Further, CodeView can trace into BASIC's library routines,
- as well as calls to C or assembly language routines that you have written.
- CodeView can also be used to see how many bytes of code are generated
- for each BASIC statement. This is a good way to compare the relative
- efficiency of different programming methods, to see which ones produce less
- code. It is important to understand that the size of the assembly language
- code generated for a given BASIC statement is a combination of two factors:
- the number of bytes the compiler generates for each occurrence of the
- statement, and the size of the called routine within BASIC's runtime
- library. Of course, the called routine is added to your program only once.
- However, the code that sets up and calls the routine is added each time the
- statement is encountered.
- Compiling a program for use with CodeView is very simple, and merely
- requires the addition of special compiler and linker option switches. Note
- that you cannot compile a program for CodeView from within the QuickBASIC
- editor; you must compile and link manually from the DOS command line, as
- shown below. Also notice that the BASIC program must be saved as ASCII
- text, and not with the special "Fast Load" method that QB optionally uses.
-
-
- bc program /zi [/other options];
- link program /co [/other options];
- cv program
-
-
- The /zi option tells BC to write additional information into the object
- file, which is used by LINK and CodeView to relate each line of BASIC
- source code to its resultant assembly code. The more meaningfully named
- /co switch is required so LINK will know to do likewise. You may be
- interested to know that /zi is named after Microsoft legend Mark
- Zbikowski, whose initials (MZ) also appear as the first two bytes in every
- DOS .EXE file.
- Once the program has been compiled and linked, start CodeView by
- entering CV followed by the file's first name (that is, without the .BAS or
- .EXE extension). You will then be presented with a screen very similar to
- that of the QB editor. Most versions of CodeView initially show the BASIC
- source code. In other versions, you must press Alt-R-R to "restart" the
- program and bring it to the first source line. I should point out that
- CodeView is a quirky program, and it is often referred to as the program
- that people "love to hate". It has some glaring omissions, many aspects of
- its interface are inconsistent and downright obnoxious, and I personally
- would be lost without it.
- When the BASIC source is displayed, you may press F4, F7, F8, and F10,
- which perform the same functions as their BASIC editor counterparts. One
- important difference, however, is that you may also press F3 to show a mix
- of BASIC and assembly language code. Stepping through the program with F8
- and F10 will execute either a single BASIC statement or a single assembler
- command, depending on the context. That is, if you are in the BASIC view
- mode, then you will step through the BASIC code. If the assembly language
- code is being displayed, then you will step through that instead.
- Figure 4-1 [not available here, sorry] shows a screen snapshot of a
- short sample program as displayed by CodeView when it is first started in
- the BASIC view mode. Figure 4-2 [also unavailable] shows the same program
- after pressing F10 to execute up to the first statement, followed by F3 to
- view a mix of BASIC and assembly language. This screen is in a 50-line
- mode to allow the entire program to be displayed. Although it is not shown
- here, CodeView can continuously display the processor's registers in a
- small window at the right side of the screen. The register display is
- alternately activated and deactivated by pressing F2.
-
-
- FIG4-1: The CodeView display when using the BASIC view mode.
-
-
- FIG4-2: The CodeView display for the same program, but using the assembly
- language view mode.
-
-
- Notice in Figure 4-2 that CodeView displays each BASIC statement indented
- and with a line number. This lets you identify where each BASIC command
- starts, and also which block of assembly language code it is associated
- with. The numbers at the left edge of the display show the segment and
- address of each instruction in hexadecimal notation. The segment value
- never changes within a single program module, although the addresses
- increase based on the number of bytes in each assembly language
- instruction. As you can see, some assembly language commands are as short
- as one byte, and others are as long as six.
- In the first instruction, CLS, a value of -1 (FFFF hex) is passed to
- the CLS routine as a flag to show that no argument was given. Had the
- BASIC statement been CLS 2, then a value of 2 would have been moved into AX
- instead. Nine bytes of code are generated each time CLS is used, not
- counting the code within B$SCLS. Besides showing the B$SCLS routine name,
- CodeView also shows the segment and address at which B$SCLS resides.
- Knowing the routine's address is of little practical use in this situation,
- and it is displayed solely for informational purposes.
- The INPUT statement is fairly complicated to set up, and I won't
- belabor what every assembly language instruction does. But several items
- are worth discussing. The first is that CodeView attempts to relate every
- number it encounters to a variable or procedure address. In many cases
- this is confusing, because some numbers are simply that, and have no
- relationship to a variable or procedure address.
- For example, at address 39 the assembly language command MOV AX,40 is
- shown as MOV AX,b$STRTAB_END+10 (0040), as if there were some significance
- to the fact that the value 40 is an address ten bytes past the end of an
- internal string table. Likewise, two instructions later the value 40 is
- represented as being 31 bytes past the beginning of the B$LENDRW procedure.
- Two instructions past that the value 13 (0D hex) is added to AX, and again
- CodeView tries to establish a significance where none exists.
- In not one of these cases are the values shown related to the named
- address, and you should therefore treat those named labels with skepticism.
- The only symbolic names that are meaningful in most cases are variable and
- procedure names that do not have an extra value added to them. In the
- instruction MOV Word Ptr [COUNT% (0036)],b$HEAP_FIRST (0064) at address 6C,
- the address for Count% (36) is valid, while the value 64 named b$HEAP_FIRST
- is meaningless. In this case, 64 hex represents the value 100 in the BASIC
- statement Count% = 100. Whatever b$HEAP_FIRST may represent, it has no
- meaning here.
- I suggest that you enter this short program and then step through it
- one statement at a time, just to get a feel for how CodeView operates. You
- should also try tracing into some of the BASIC library calls, as well as
- into a simple subprogram or two of your own. Again, you may use either F10
- or F8 to step through the code, but only F8 will trace into code that is
- being called. You can also use F8 to trace into some BIOS interrupts, but
- you should never try to trace through a DOS interrupt (21 hex). Many DOS
- services never return, or return in a non-standard manner, and a locked-up
- PC is the likely result. You will not hurt anything if you do trace into a
- DOS interrupt, but be prepared to press Ctrl-Alt-Del.
- Besides being able to view and step through the assembly language code
- that BASIC creates, you can also view and modify your program's data
- directly. If you have pressed F2 to display the CPU's registers, CodeView
- will show the value currently in every memory address that is about to be
- accessed. For example, if the next statement to be executed is MOV Word
- Ptr [COUNT%],10, CodeView will show the current contents of the variable
- COUNT%.
- A range of memory addresses may be displayed by entering commands into
- the immediate window at the bottom of the screen. When CodeView is first
- started, the cursor is placed at the bottom line in that window. As with
- the BASIC editor, the F6 key is used to toggle between the code output and
- immediate windows. Unlike the BASIC editor, however, you may type commands
- regardless of which window is active.
- The three primary commands you will find useful are D, U, and R. The
- D (Dump) command tells CodeView to display a range of memory, starting at a
- given address. For example, D 0 means to show the 32 bytes that start at
- address 0 in the default data segment. Likewise, D ES:100 means to start
- at address 100 in the segment held in the ES register. Unfortunately,
- CodeView is particularly obtuse in this regard, because in some cases the
- numbers you enter are assumed to be decimal while in others it assumes
- hexadecimal. Which is which depends on your view perspective (selected
- with F3), and I won't even begin to offer a reason or explain the confusing
- rules. If you don't get what you expect, try adding an "&H" prefix to the
- number. And if you start by using &H and CodeView reports a syntax error,
- then try it without the &H.
- When the contents of memory are displayed, they are shown as
- individual bytes, rather than as integer words, which are generally more
- useful. In the listing below, two string constants have been displayed in
- response to the command D &H40. For space reasons, the segment and address
- which CodeView adds to the left of each row of values are instead shown
- above the rows.
-
-
- >D &H40
-
- 5676:0040
- 02 00 44 00 48 69 23 00 4A 00 41 42 43 44 45 46
- 5676:0050
- 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56
-
-
- As you learned in Chapter 2, BASIC near strings have a 4-byte descriptor,
- with the first two bytes holding the string's current length, and the
- second two bytes its current address. Beginning with the first two numbers
- displayed, the 02 00 represents the length of a 2-character string, and the
- 44 00 indicates the address, which is 44 hex. The data itself is a
- CHR$(&H48) followed by a CHR$(&H69) ("Hi"), and it immediately follows the
- string
- descriptor. When two bytes are used to store an integer word, the least
- significant byte is kept in the lower memory address. Therefore, the value
- 0002 is actually listed as 02 00 (CodeView adds an extra blank between
- bytes for clarity).
- Immediately following the six bytes for the string "Hi" and its
- descriptor is another descriptor. This one shows that the string has a
- length of 23 hex (35 decimal) bytes, and its data starts at address 4A hex.
- Again, the value 0023 is shown as 23 00, and the address 004A is displayed
- as 4A 00. The 22 data bytes visible in this display are
- "ABCDEFGHIJKLMNOPQRSTUV"; the rest of the string lies past address 5F.
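- The same four descriptor bytes can also be read from within a BASIC
- program. The sketch below assumes the near strings used by QuickBASIC 4.x
- (far strings in BASIC PDS are organized differently), and the variable
- names are arbitrary.
-
- Work$ = "Hi"
- DEF SEG                            'use BASIC's default data segment
- Desc& = VARPTR(Work$)              'address of the 4-byte descriptor
- IF Desc& < 0 THEN Desc& = Desc& + 65536
- Length& = PEEK(Desc&) + 256& * PEEK(Desc& + 1)          'low byte first
- DataAddr& = PEEK(Desc& + 2) + 256& * PEEK(Desc& + 3)
- PRINT "Length:"; Length&; " LEN reports:"; LEN(Work$)
- PRINT "Address: "; HEX$(DataAddr&); "  SADD reports: "; HEX$(SADD(Work$))
-
- If the assumptions hold, the two numbers on each line should agree.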
- The U (Unassemble) command can be used to show the assembly language
- source code at any arbitrary segment and address. The command U 2000:1000
- will unassemble the code at address 2000:1000, though again you may need to
- use U &H2000:&H1000 in some view modes. The U command is not used that
- frequently, since CodeView is used most often to step through code in
- sequence, rather than to examine an arbitrary block of instructions.
- The R command lets you change the contents of a register, and this
- might be useful when debugging your own assembly language subroutines.
- When you type, for example, RCX and press Enter, the current value of the
- CX register is displayed and you are prompted for a new value. Pressing
- Enter alone cancels the command and leaves the current register contents
- intact. Otherwise, the value you enter will be assigned to CX. This is
- similar to BASIC's immediate window, in which you can assign new values to
- a variable.
- The last CodeView features worth describing here are Watch Variables
- and Watch Points, which are similar to the same features in QB. Unlike QB,
- though, you cannot use an expression as the target of a Watch; it must be a
- simple variable name, array element, or address. Watch Variables may be
- added using the pull-down menu, or by pressing Alt-W and then typing the
- variable name. If you are in the BASIC view mode you may add only BASIC
- variables; in the assembly language view mode you can add only assembly
- language variables. To monitor the contents of a memory address requires
- the W command. For example, W 40 will set up address 40 as the target of a
- Watch.
- Although CodeView does support Watch points, whereby the program will
- run continuously until a given expression is true, you won't want to use
- that feature. Asking CodeView to stop when, say, CX becomes greater than
- 100 will cause your program to run at less than one thousandth its normal
- speed. Therefore, I have never found using Watch Points effective in any
- situation--it is always too slow.
- I have avoided discussing the latest versions of CodeView, in favor of
- focusing on those features which are common to all versions. CodeView
- 3.10, which is included with BASIC 7.1, has several new convenience
- features, and
- a few new bugs as well. Many of the commands that in earlier versions have
- to be entered manually are now available by simply typing new values onto
- the display. For instance, where older versions of CodeView required you
- to enter Dump commands repeatedly, the new version updates the displayed
- values in a range of addresses constantly. And to change the address
- range, you may now simply move the cursor to the segment and address
- numbers and type new ones. An option to display memory values as words or
- even single and double precision values is also present in version 3.10.
- Now that you have seen what CodeView is all about and how to use it, I
- want to conclude this chapter with a practical example. As I mentioned in
- Chapter 3, the amount of stack memory that is needed in a non-static
- subprogram or function can be difficult to determine. The calculation
- itself is trivial: simply add up the number of bytes needed by every
- variable in the routine. Each integer requires two bytes; single
- precision, long integer, and string variables need four bytes; and so
- forth. The problem, of course, is who wants to do all that counting,
- especially when there may be hundreds of variables. Counting is what
- computers are for, no?
- The solution is that BASIC knows how many bytes are needed for the
- subprogram, and the very first thing a subprogram does when it is invoked
- is to call another routine that allocates the necessary stack space. So
- rather than use trial and error methods to increase the stack in small
- increments, you can use CodeView to directly see how many bytes of stack
- space are being requested. Here's how that's done, using the example
- program shown below.
-
-
- DEFINT A-Z
- DECLARE SUB StackTest (Dummy)
- Test = 10
- CALL StackTest(Test)
- END
-
- SUB StackTest(AnyVar)
- X = 100
- Y = 10
- Z = AnyVar
- END SUB
-
-
- Save this program as an ASCII file using the name TEST.BAS, and then
- compile it with the /o and /zi options. Next, link TEST.OBJ for CodeView
- using the /co option. Then start CodeView by entering CV TEST. Once you
- are in CodeView and viewing the BASIC source, press F10 to skip past
- BASIC's start-up code. At this point the cursor should be on the first
- statement, Test = 10. Finally, press F3 to show a mix of BASIC and
- assembly language source code. The display should look similar to that
- shown in Figure 4-3 [unavailable].
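- Spelled out as commands, and following the syntax shown earlier in this
- chapter, those steps would look something like this:
-
- bc test /o /zi;
- link test /co;
- cv test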
-
-
- FIG4-3: How to determine the amount of stack memory needed for a non-static
- procedure.
-
-
- Notice the first statement within the StackTest subprogram at line 7, where
- the value 6 (erroneously labeled b$STRTAB+6) is assigned to the CX
- register. This is the number of bytes of stack space being requested from
- the B$ENRA routine which is called in the next instruction. B$ENRA is the
- routine that actually allocates the stack memory, and it uses the value
- BASIC sends in CX to know how many bytes are needed. StackTest has three
- local variables and each is a two-byte integer, hence six bytes are
- required to store them on the stack.
- For a very large program, the value assigned to CX will of course be
- much larger. Further, if one subprogram calls another, it will be up to
- you to add up all of the CX values to determine the total stack memory
- requirements. But this is very much easier than counting variables.
-
-
- SUMMARY
-
- In this chapter you have learned how to identify and correct common
- programming errors. You have also learned the importance of understanding
- BASIC's various quirks, and how some statements do not always do exactly
- what you thought they would. I have shown several debugging strategies,
- including a software adaptation of the "cut in half" hardware technique.
- Perhaps your most powerful debugging allies are the QuickBASIC and QBX
- editing environments. These powerful editors let you single step through a
- program, monitor variable values and function results, and halt your
- program when a specified condition occurs.
- When BASIC terminates a program prematurely with an error message and
- a segmented address, you can either use the BC compiler's /a option to
- generate a source listing, or use CodeView to see where the error occurred.
- CodeView can also be used to step and trace through a program at the
- assembly language source level, and to determine the number of bytes of
- stack memory a non-static procedure requires.
- In Chapter 5 you will learn about compiling and linking BASIC
- programs. I will present a complete overview of the many BC and LINK
- options that are available, and discuss the relative merits of each.